Using compatible shape descriptor for lexicon reduction of printed Farsi subwords
Abstract
This paper presents a method for lexicon reduction of printed Farsi subwords based on their holistic shape features. Persian subwords are numerous and vary in shape from a single letter to a complex combination of several connected characters, so it is not easy to find one fixed shape descriptor that suits all of them. We therefore propose to select the descriptor according to the shape characteristics of the input. To do this, a neural network is trained to predict the appropriate descriptor for the input image. This network is embedded in the proposed lexicon reduction system, where it decides which descriptor is used to compare the query image with the lexicon entries. An evaluation on a dataset of Persian subwords supports the effectiveness of treating different query shapes differently.
Keywords: Lexicon reduction, Shape description, Compatible descriptor, Persian, Farsi
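The pipeline described in the abstract (pick a descriptor for the query, then rank lexicon entries with it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the two descriptors (`projection_descriptor`, `zoning_descriptor`), the rule-based `select_descriptor` that stands in for the trained neural network, the Euclidean distance, and the toy 4x4 binary images.

```python
import math

def projection_descriptor(img):
    """Hypothetical descriptor: row and column ink counts, normalised by total ink."""
    total = sum(map(sum, img)) or 1
    rows = [sum(r) / total for r in img]
    cols = [sum(c) / total for c in zip(*img)]
    return rows + cols

def zoning_descriptor(img, grid=2):
    """Hypothetical descriptor: ink density in each cell of a grid x grid partition."""
    h, w = len(img), len(img[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            ys = range(gy * h // grid, (gy + 1) * h // grid)
            xs = range(gx * w // grid, (gx + 1) * w // grid)
            cells = [img[y][x] for y in ys for x in xs]
            feats.append(sum(cells) / max(len(cells), 1))
    return feats

DESCRIPTORS = {"projection": projection_descriptor, "zoning": zoning_descriptor}

def select_descriptor(img):
    """Placeholder for the trained network: a fixed rule on ink density (assumption)."""
    density = sum(map(sum, img)) / (len(img) * len(img[0]))
    return "zoning" if density > 0.5 else "projection"

def reduce_lexicon(query, lexicon, keep=2):
    """Return the chosen descriptor name and the `keep` closest lexicon labels."""
    name = select_descriptor(query)
    describe = DESCRIPTORS[name]
    q = describe(query)
    ranked = sorted(lexicon, key=lambda label: math.dist(q, describe(lexicon[label])))
    return name, ranked[:keep]

# Toy lexicon of equally sized binary images (1 = ink); labels are made up.
LEXICON = {
    "bar":   [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]],
    "stem":  [[0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0], [0, 1, 0, 0]],
    "block": [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 0], [0, 0, 0, 0]],
}
QUERY = [[0, 0, 0, 0], [1, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

chosen, shortlist = reduce_lexicon(QUERY, LEXICON, keep=2)
```

The shortlist, rather than the full lexicon, would then be passed to a recognizer. In the sketch the selector is a hand-written rule only so that the example runs without training data.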
Similar resources
Search Space Reduction for Farsi Printed Subwords Recognition by Position of the Points and Signs
In the field of word recognition, three approaches are used: isolation, overall shape, and a combination of the two. Most optical recognition methods recognize a word by breaking it into its letters and then recognizing them. This approach faces problems because of the difficulty of isolating letters and its recognition accuracy in texts with low image quality. Therefo...
Detection and compensation of undesirable discontinuities within the farsi/arabic subwords
In this paper, an unexplored subject in the domain of Farsi/Arabic handwritten word preprocessing is introduced. Subwords play a vital role in many applications such as cheque amount recognition, text recognition, lexicon reduction and subword-based word recognition. Correcting faults that occur in subwords will improve the overall performance of these applications. A subword is a connected-...
Arabic word descriptor for handwritten word indexing and lexicon reduction
Word recognition systems use a lexicon to guide the recognition process in order to improve the recognition rate. However, as the lexicon grows, the computation time increases. In this paper, we present the Arabic word descriptor (AWD) for Arabic word shape indexing and lexicon reduction in handwritten documents. It is formed in two stages. First, the structural descriptor (SD) is computed for ...
A two-stage method for recognition of handwritten Farsi words using adaptive blocking of the image gradient
This paper presents a two-step method for offline handwritten Farsi word recognition. In the first step, to improve recognition accuracy and speed, an algorithm is proposed for the initial elimination of lexicon entries unlikely to match the input image. For lexicon reduction, the words of the lexicon are clustered using the ISOCLUS and hierarchical clustering algorithms. Clustering is based on the featur...
A Study on Farsi Handwriting Styles for Online Recognition
Knowing the different ways a letter is written within a word or subword across handwriting styles is very beneficial for recognition, especially online recognition. In this paper, the TMU-OFS dataset, consisting of 1000 frequent Farsi subwords, is employed to study Farsi handwriting styles. The subwords are grouped based on their delayed strokes and their main bodies, separately. The handwriting style...
Journal title:
- CoRR
Volume abs/1601.06251
Publication date 2016